b"Department of Health and Human Services\n\n        OFFICE OF\n   INSPECTOR GENERAL\n\n\n\n\n USING SOFTWARE TO DETECT\n UPCODING OF HOSPITAL BILLS\n\n\n\n\n                   JUNE GIBBS BROWN\n                    Inspector General\n\n                       August 1998\n                      OEI-01-97-00010\n\x0c                        OFFICE OF INSPECTOR GENERAL\n\nThe mission of the Office of Inspector General (OIG), as mandated by Public Law 95-452, is to\nprotect the integrity of the Department of Health and Human Services programs as well as the\nhealth and welfare of beneficiaries served by them. This statutory mission is carried out through a\nnationwide program of audits, investigations, inspections, sanctions, and fraud alerts. The\nInspector General informs the Secretary of program and management problems and recommends\nlegislative, regulatory, and operational approaches to correct them.\n\n                          Office of Evaluation and Inspections\n\nThe Office of Evaluation and Inspections (OEI) is one of several components of the Office of\nInspector General. It conducts short-term management and program evaluations (called\ninspections) that focus on issues of concern to the Department, the Congress, and the public. The\ninspection reports provide findings and recommendations on the efficiency, vulnerability, and\neffectiveness of departmental programs.\n\nOEI's Boston office prepared this report under the direction of Mark R. Yessian, Ph.D., Regional\nInspector General. 
Principal OEI staff included:\n\nBOSTON REGION                                                HEADQUARTERS\n\nRussell Hereford, Project Leader                             Mark Krushat\nKenneth Price                                                Tricia Davis\nNicola Pinson                                                Linda Moscoe\n                                                             Brian Ritchie\n\n\n\n   To obtain a copy of this report, please contact the Boston Regional Office by telephone at\n                         (617) 565-1050 or by fax at (617) 565-3751.\n\n\n          Reports are also available on the World Wide Web at our home page address:\n\n                                http://www.dhhs.gov/progorg/oei\n\x0c                   EXECUTIVE SUMMARY\n\nPURPOSE\n\n       The purpose of this study is to test the ability of commercial software products to identify\n       Diagnosis Related Group upcoding in Medicare hospital bills.\n\nBACKGROUND\n\n       Since 1983, Medicare has paid acute care hospitals for the care of its beneficiaries under a\n       prospective payment system using Diagnosis Related Groups (DRGs). In fiscal year 1996,\n       expenditures for inpatient hospital care under this system totaled $77.6 billion.\n\nImproper Payments\n\n       Improper hospital payments are a continuing concern in the Medicare program. In its\n       Chief Financial Officers audit of Medicare, the Office of Inspector General (OIG)\n       estimates that in fiscal year 1997, $4.1 billion of DRG payments were inappropriate due to\n       lack of medical necessity, insufficient or no documentation, or incorrect coding.\n\n       One particular concern is upcoding of hospital bills, the practice of billing for a hospital\n       stay more expensive than the one actually incurred. 
In previous studies, we found\n       upcoding in DRGs ranging from 7 to 13 percent.\n\nCommercial Upcoding Detection Software\n\n       Dozens of vendors now offer upcoding detection software that locates potentially\n       upcoded DRGs by analyzing electronic files of hospital bills. These products are likely to\n       become increasingly sophisticated as the state of the art of computing races ahead.\n\n       In this inquiry we evaluated the ability of two promising products to identify DRG\n       upcoding. First, we used these products to identify hospital bills with suspected upcoded\n       DRGs. Then we used professional record reviewers to perform a blinded medical review\n       on a sample of cases to assess how well the products predicted DRG upcoding at the\n       hospital, DRG, and case levels.\n\nFINDINGS\n\nHospital Level\n\n       Hospitals identified by the software had an average upcoding rate of 11.5 percent, more\n       than double the 5.3 percent average upcoding rate of the control hospitals.\n\n\n\n\n OEI-01-97-00010                          )))))))))))              Software to Detect Upcoding of Hospital\n                                               i                                                     Bills\n\x0c       However, the software also identified as high upcoders a substantial number of hospitals in\n       which our medical record review identified few or no upcoded cases.\n\nDRG Level\n\n       The software performed best at identifying upcoded cases in three DRGs that show the\n       highest rates of actual upcoding: DRG 87, pulmonary edema and respiratory failure; DRG\n       79, respiratory infections and inflammations; and DRG 144, other circulatory system\n       diagnoses. 
These three DRGs account for 3.5 percent of all Medicare discharges, or about\n       350,000 discharges per year.\n\n       However, among the most commonly occurring DRGs, we found that the software was no\n       more effective in identifying upcoded cases than among other DRGs.\n\nCase Level\n\n       The software successfully identified between 50 and 60 percent of cases that were actually\n       upcoded. Over 40 percent of upcoded cases went undetected.\n\n       However, only 10 to 20 percent of cases that the software identified as upcoded were, in\n       fact, upcoded.\n\nCONCLUSIONS\n\n       Our analysis of the software products provides some basis for optimism about the role that\n       such products can play in detecting DRG upcoding. Yet we temper that optimism with\n       strong caution as to the current state of the art of this software and the need to couple its\n       use with other measures in the detection and prevention portfolio.\n\n       The software we examined showed modest success in identifying hospitals with a high rate\n       of upcoding, and in identifying upcoded cases within a narrowly defined group of DRGs\n       that exhibited the most frequent upcoding. Thus, software could be used to identify\n       hospitals that may need close scrutiny either before or after Medicare pays them. However,\n       because these products were distinctly less successful for most other DRGs, we see only a\n       limited role for these products at the current time.\n\n       It is likely, however, that the software market will continue to develop over time, and that\n       products such as these will advance in sophistication and become more useful as part of a\n       fraud detection strategy. No doubt HCFA will want to stay abreast of opportunities that\n       this technology may present.\n\nVENDOR COMMENTS\n\n       We provided copies of our draft report and our contractor's report to the three vendors\n       whose software products we tested. 
We wish to express our appreciation to these\n       companies for their willingness to let us use their products in this inspection, and for their\n       comments and analysis of our draft report.\n\n      These companies raised two general points in their responses. First, the companies\n      indicated that their products could be modified in ways that address the Medicare\n      population more directly, and that they are continuously updating and enhancing their\n      products. We note that our purpose was not to develop new software, but to test\n      commercially available off-the-shelf software. We did not modify the vendors' software,\n      nor did we ask them to modify the software or to develop a specific software product for\n      this purpose.\n\n      Second, they questioned the methods we used to test the software. We stand by our\n      methodology. We tested the software in a way that we considered would be useful to an\n      agency such as HCFA. We took the software's underlying claim-by-claim approach and\n      aggregated the results of the individual claims analysis to the provider level. We\n      then verified the software products' performance by reviewing cases among a sample of\n      the providers that the software identified as having a high rate of upcoding. In our\n      judgment, this was a practical extension of the software. 
We used these products in a\n      manner that might identify and focus on providers that merit additional scrutiny in a fraud\n      prevention and detection effort.\n\n      We also address the methodological issues that one of the vendors raised in its response to\n      the report.\n\n      We include the full text of each vendor's comments in Appendix D.\n\n\n                              TABLE OF CONTENTS\n\n\nEXECUTIVE SUMMARY\n\nINTRODUCTION\n\nFINDINGS\n\n          Hospital Level\n\n          DRG Level\n\n          Case Level\n\nCONCLUSIONS\n\nVENDOR COMMENTS\n\nAPPENDICES\n\n          A: Software Vendor Search\n\n          B: Testing Methodology\n\n          C: Statistical Tables\n\n          D: Text of Software Vendors' Comments\n\n          E: Endnotes\n\n\n                            INTRODUCTION\n\n\nPURPOSE\n\n       The purpose of this study is to test the ability of commercial software products to identify\n       Diagnosis Related Group upcoding in Medicare hospital bills.\n\nBACKGROUND\n\n       Since 1983, Medicare has paid hospitals for the care of its beneficiaries under a\n       prospective payment system (PPS) using Diagnosis Related Groups (DRGs). In fiscal\n       year 1996, expenditures for hospital care under this system totaled $77.6 billion.1 Under\n       PPS, payment to hospitals for each Medicare case is based on a hospital-specific payment\n       rate, multiplied by the weight of the DRG to which the case is assigned. Each DRG\n       weight represents the average resources required to care for cases in that particular DRG\n       relative to the average resources used to treat cases in all DRGs.\n\n       Cases are classified into DRGs based on the principal diagnosis, up to eight additional\n       diagnoses, and up to six procedures performed during the stay, as well as the age, sex, and\n       discharge status of the patient. Upon discharge, the physician summarizes information on\n       a discharge face sheet. 
A hospital coder then reviews the entire medical record and uses\n       that information to assign the most appropriate codes from the International Classification\n       of Diseases, 9th Revision, Clinical Modification (ICD-9-CM). The hospital uses this\n       information to prepare a claim for payment, which it forwards to the Medicare fiscal\n       intermediary. The intermediary applies a series of edits to the claim, then groups the\n       ICD-9-CM codes in the claim into the appropriate DRG for payment to the hospital.\n\nImproper Payments\n\n       Improper hospital payments are a continuing concern for Medicare's Part A trust fund. In\n       its Chief Financial Officers audit of Medicare, the Office of Inspector General (OIG)\n       estimates that in fiscal year 1997, $4.1 billion of hospital payments were inappropriate\n       due to lack of medical necessity, insufficient or no documentation, or incorrect coding.2\n       One particular concern is upcoding of hospital bills, the practice of billing for a hospital\n       stay more expensive than the one actually incurred. In previous OIG studies, we found\n       DRG upcoding rates ranging from 7 to 13 percent.3,4\n\nCommercial Upcoding Detection Software\n\n       As the pressure on public and private insurers to eliminate improper payments has risen,\n       the market for software to detect upcoding has grown rapidly. Dozens of\n       vendors now offer such software. These products analyze the diagnostic and\n       administrative data from each hospital bill in an electronic claims file to predict whether\n       the DRG contained in the bill is upcoded. 
Many vendors sell their products off the\n       shelf, ready to be installed and used with minimal investment and setup time. These\n       products are likely to become increasingly sophisticated as the state of the art of\n       computing races ahead.\n\nThis Inquiry\n\n       In its fiscal year 1996 Chief Financial Officers audit of Medicare, OIG recommended\n       that HCFA enhance prepayment and postpayment controls by updating computer systems\n       to better detect improper claims. In this inquiry we evaluated the ability of two promising\n       products to identify DRG upcoding through electronic analysis of hospital bills. We chose\n       these two products from a field of 21 vendors who offer similar products. We intended\n       this test to be illustrative of how software might complement HCFA's existing program\n       integrity initiatives by functioning as one part of a broad strategy for DRG payment\n       safeguarding. We based our evaluation on a blinded medical record review of a national\n       sample of 2,622 Medicare cases from 1996. The review was performed by an independent\n       contractor using accredited medical records professionals.\n\nMETHODOLOGY\n\n       We executed the test in two phases, using a contractor with expertise in statistical\n       sampling and medical record review for highly specialized tasks.\n\n       In phase one we used our contractor to search for vendors of upcoding detection\n       software. The search initially identified 57 vendors whose product description indicated\n       some type of claim auditing software or services. Further research on these 57 vendors\n       reduced the list to 21 vendors who had software that appeared relevant to our study. 
Out\n       of these 21 software vendors, 3 agreed to participate in our study.5\n\n       Using software from these three vendors, we analyzed 100 percent of Medicare inpatient\n       claims from January through June of 1996 to identify claims that appeared to be upcoded.6\n       Next, we aggregated the output from each software product by hospital to generate three\n       lists of hospitals with high predicted rates of upcoding. Through correlation analysis, we\n       discovered a strong relationship between the lists of hospitals generated by two of the\n       software products, while the list from the third product differed significantly. Due to\n       limits on the number of medical records we could review for this study, we decided to\n       focus our inquiry by testing only the output from the two software products whose lists\n       were closely correlated.7 Therefore, as the first step of our sampling, we selected 50\n       hospitals that both products predicted had high rates of upcoding. We refer to this group\n       of hospitals as our test sample.\n\n       As a control, we also selected a sample of 20 hospitals that fell into similar size strata as\n       our test sample but did not have high predicted rates of upcoding. This brought the total\n       number of hospitals in our study to 70: 50 hospitals with high predicted rates of\n       upcoding and 20 hospitals without high predicted rates of upcoding. From each hospital,\n       we sampled 40 Medicare inpatient admissions billed under any of the 50 most prevalent\n       DRGs in 1996. This brought the total number of cases in our study to 2,800. We were\n       able to obtain and complete analysis on 2,622 (94 percent) of these cases. 
Tables C-1\n      through C-4 in Appendix C present data on the characteristics of the hospitals and cases\n      examined in this review.\n\n      In phase two our contractor performed a blinded medical record review of each case. For\n      this task, the contractor used Registered Records Administrators, Accredited Records\n      Technicians, Certified Coding Specialists, and physicians. Based on the contents of the\n      medical record, the contractor derived a new set of ICD-9-CM diagnostic and procedure\n      codes and used them to generate a new DRG for each case. If there was a discrepancy\n      between the new DRG and the DRG for which the hospital had billed Medicare, the case\n      was referred for a second blinded review to determine the final DRG. If the contractor\n      calculated a final DRG that was less expensive than the DRG for which the hospital had\n      billed Medicare, we defined the case as upcoded.8 Thus, for each of the 2,622 cases in our\n      review, we knew whether it had been properly coded or upcoded.\n\n      We then used this information to determine if the two software products successfully\n      predicted whether each case in our sample had an upcoded DRG. We analyzed these data\n      on three levels: by hospital, by DRG, and by case. To perform hospital-level and\n      DRG-level analysis, we aggregated our data by hospital and DRG to compare actual and\n      predicted rates of upcoding. Case-level analysis examined the success of the software in\n      predicting DRG upcoding on a case-by-case basis. 
Our analyses used t-tests and logistic\n      regression to test for statistically significant differences.\n\n      A detailed description of the software vendor search and testing methodology appears in\n      Appendices A and B.\n\n      We conducted this study in accordance with the Quality Standards for Inspections issued\n      by the President's Council on Integrity and Efficiency.\n\n\n                                     FINDINGS\n\nHospital Level\n\n       Hospitals identified by the software had an average upcoding rate of 11.5 percent,\n       more than double the 5.3 percent average upcoding rate of the control hospitals.\n\n       The 50 hospitals in the test group (those hospitals that the software identified as having\n       high rates of upcoding) did in fact exhibit higher upcoding rates than the hospitals in our\n       control sample. The 12 hospitals in which our medical records reviewers found the\n       highest upcoding rates were all in the test sample. Seven of the 50 test hospitals had upcoding\n       rates of 25 percent or higher; more than half (26 out of 50) had an upcoding rate of 10\n       percent or higher. (See Table I.)\n\n       The 20 hospitals in the control sample tended to have low upcoding rates. In the sample\n       of cases from control hospitals, our reviewers found upcoding rates below 5 percent in\n       half the hospitals. They found no upcoded cases at all in 5 of the 20 control hospitals.\n\n       We performed logistic regression analysis to control for the effects of additional variables\n       related to the hospitals (e.g., teaching status, ownership) and individual cases (e.g., patient\n       gender and age). 
Even taking these variables into account, we found that the likelihood of\n       a case being upcoded in hospitals identified by the software was almost twice as high as it\n       was for hospitals in the control sample. Table C-5 in Appendix C presents the full results\n       of this analysis.\n\n                     Table I: Upcoding Rates for Test and Control Hospitals\n\n                                            Entire Sample   Test Hospitals   Control Hospitals\n                                            (n=70)          (n=50)           (n=20)\n\n             Average upcoding rate          9.8%            11.5%            5.3%\n             (t=3.57, p<.001)\n\n             Number (percent) with\n             upcoding >= 25%                7 (10%)         7 (14%)          0 (0%)\n\n             Number (percent) with\n             upcoding >= 10% and < 25%      22 (31%)        19 (38%)         3 (15%)\n\n             Number (percent) with\n             upcoding < 10%                 41 (59%)        24 (48%)         17 (85%)\n\n\n       However, the software also identified as high upcoders a substantial number of\n       hospitals in which our medical record review identified few or no upcoded cases.\n\n       In 6 of the 50 hospitals in the test sample, our medical records reviewers found no\n       upcoded cases, even though the software predicted that these hospitals would have high\n       rates of upcoding. 
In 15 of these 50 hospitals we found upcoding rates of 5 percent or\n      less.\n\n      At the same time, it is worth noting that our reviewers found upcoding rates of 10 percent\n      or higher in 3 of the 20 control hospitals.\n\nDRG Level\n\n       Measuring the Effectiveness of Software in Identifying Upcoded Cases\n\n       The effectiveness of a software product can be measured along two dimensions, referred\n       to as sensitivity and specificity. Each dimension can be expressed as a percentage.\n\n       Sensitivity measures the extent to which the software identifies all cases that have been\n       upcoded. In our sample of 2,622 cases, our independent medical records reviewers\n       determined that 254 cases (9.7 percent) had been upcoded. A software product that was\n       perfectly sensitive would identify all 254 of these cases.\n\n       Specificity assesses the software's efficiency. Specificity measures the software's\n       ability to discriminate between those cases that were upcoded and those cases that were\n       not upcoded, i.e., the extent to which the software identifies only those cases that really\n       were upcoded. 
If the software were perfectly specific, every case that it identified would\n       be upcoded.\n\n       An ideal product would be perfectly sensitive and perfectly specific. In our review, for\n       example, a perfect product would have selected all 254 upcoded cases and omitted the\n       other 2,368 cases.\n\n       In reality, there often is a trade-off between sensitivity and specificity. To achieve\n       greater sensitivity, the software must cast a wide net; this means that it might identify\n       some cases that were not really upcoded, referred to as 'false positives.' Conversely, to\n       achieve greater specificity, the software risks missing some cases that actually were\n       upcoded; those cases that it misses are referred to as 'false negatives.'\n\n\n\n\n      We examined the performance of the software among two sets of DRGs that we\n      consider potentially high risk to the Medicare program in terms of potential dollars lost:\n      those DRGs in which we found a high level of upcoding and those DRGs that occur\n      most frequently. We examined these DRGs to determine whether the software might be\n      used most efficiently on a subset of DRGs that represent potentially high\n      cost to the Medicare program, either because they exhibited high rates of upcoding or\n      because of the sheer volume of cases.\n\n      The software performed best at identifying upcoded cases in DRGs that show the\n      highest rates of actual upcoding.\n\n      The software was most accurate in identifying cases in the three DRGs with the highest\n      actual rates of upcoding. 
These three DRGs account for 3.5 percent of all Medicare discharges,\n      or about 350,000 discharges per year:\n\n              DRG 87 (Pulmonary edema and respiratory failure). Our medical records\n              reviewers found an actual upcoding rate of 41 percent. The software products had\n              sensitivity rates of 69 percent and 61 percent, and specificity rates of 60 percent\n              and 47 percent.\n\n              DRG 79 (Respiratory infections and inflammations, age > 17 with complications\n              or comorbidities). Our medical records reviewers found an actual upcoding rate of\n              35 percent. The software products had sensitivity rates of 95 percent and 55\n              percent, and specificity rates of 36 percent each.\n\n              DRG 144 (Other circulatory system diagnoses with complications or\n              comorbidities). Our medical records reviewers found an actual upcoding rate of\n              30 percent. The software products had sensitivity rates of 86 percent and\n              71 percent, and specificity rates of 86 percent and 42 percent.\n\n      For DRGs with lesser but still high rates of upcoding, however, the software was less\n      accurate. For example, one software product flagged no cases in the DRGs with the\n      fourth highest upcoding rate (DRG 239, with 24 percent actual upcoding) or the fifth\n      highest upcoding rate (DRG 429, with 23 percent actual upcoding); the other product was only\n      slightly more successful. 
For informational purposes, we present data on sensitivity and\n      specificity of the software for the 10 DRGs with the highest rates of upcoding in Appendix C,\n      Table C-6.\n\n      One implication arising from this analysis is that once DRGs that exhibit high levels of\n      upcoding have been found (for example, through ongoing case review and analysis of\n      discharges), the software products may have a role to play in helping to identify specific\n      cases within those DRGs that merit further scrutiny.\n\n      We also examined how well the software performed in detecting case-specific upcoding\n      among the group of 10 DRGs with the highest upcoding rates, versus the other DRGs we\n      reviewed in this inspection. (See Table C-7 in Appendix C.) We found no statistically\n      significant difference between these two groups in the software's sensitivity (i.e., its ability\n      to identify upcoded cases). We did, however, find that the software was more specific among\n      those frequently upcoded DRGs. In other words, the cases that the software did identify\n      in those DRGs were more likely to be actually upcoded.\n\n      However, among the most commonly occurring DRGs, we found that the software\n      was no more effective in identifying upcoded cases than among other DRGs.\n\n      The 10 most commonly occurring DRGs account for 10 percent of all Medicare discharges,\n      or about 1 million discharges per year. 
Within our sample, they made up 13 percent of\n       cases reviewed, yet they accounted for 53 percent of upcoded cases.9 Table C-8 in\n       Appendix C provides data on these 10 most commonly occurring DRGs.\n\n       We examined how well the software performed in detecting case-specific upcoding among\n       this group of 10 DRGs, versus the other DRGs we reviewed in this inspection. (See Table\n       C-9 in Appendix C.) We found no statistically significant difference in sensitivity or\n       specificity for either product in its ability to detect upcoding among these 10 most commonly\n       occurring DRGs, compared with the performance of the products in correctly identifying\n       upcoded cases in other DRGs.\n\nCase Level\n\n       Our sensitivity analysis showed that the software products successfully identified\n       between 50 and 60 percent of cases that were actually upcoded.\n\n       Our medical records reviewers determined that our sample contained 254 cases that had\n       been upcoded. Of these 254 upcoded cases, one product identified 133 (52 percent),\n       and the other product identified 147 (58 percent). These sensitivity rates\n       have an important implication: over 40 percent of cases that actually were upcoded went\n       undetected by these products.\n\n       However, our specificity analysis showed that only 10 to 20 percent of cases that the\n       software products identified as upcoded were, in fact, upcoded.\n\n       One product identified 685 cases as upcoded, but only 133 (19 percent) of these cases\n       were determined by our reviewers to be upcoded. 
For the other product, out of 1,284\n       cases it identified as upcoded, 147 (11 percent) were determined by our reviewers to\n       actually be upcoded. Such a low specificity rate reduces the efficiency of the software as a\n       detection tool by requiring that multiple cases be reviewed in order to locate each upcoded\n       case. In essence, for the product with 19 percent specificity, reviewers would need to\n       examine about 4 false leads to find 1 case that truly was upcoded. For the product with the 11\n       percent specificity rate, that review level rises to about 8 false leads for each truly upcoded\n       case (1,137 false leads for 147 upcoded cases).\n\n\n                             CONCLUSIONS\n\n\n      The software we examined showed modest success in identifying hospitals with a high rate\n      of upcoding. Cases in the hospitals that the software identified were about twice as likely to\n      be upcoded as cases in a control group of hospitals. This finding leads us to believe that the\n      software could be used in an ongoing way to identify hospitals that are likely to upcode\n      their Medicare bills.\n\n      There are various ways in which the software might be applied at the hospital\n      level. For example, HCFA might wish to use this software as a tool in its post-payment\n      recovery efforts. Based on the results we found, HCFA could use such software to\n      retrospectively identify hospitals in which it would be likely to find a high level of upcoded\n      cases and commensurate overpayment. 
Alternatively, the agency could use this software\n      to identify hospitals that have previously demonstrated a tendency to upcode, and then\n      perform focused review on cases from these hospitals prior to making payments.\n\n      The software also showed some success within the narrowly defined group of DRGs that\n      exhibited the most frequent upcoding. Because the software was relatively successful in\n      identifying particular cases that were upcoded among these DRGs, its use here could be\n      expected to yield significant returns. For post-payment recovery efforts, HCFA could opt\n      to focus on cases in the upcoded DRGs; analogously, the software could be used\n      prospectively to identify cases in particular DRGs for review prior to payment.\n\n      At the same time, our review leads us to urge caution about these products, particularly at\n      the individual case level. While they worked well for the most frequently upcoded DRGs,\n      our review determined that these products were distinctly less successful for other DRGs.\n      It is for this reason that we see only a limited role, as described above, for these products\n      at the current time.\n\n      The two software products that we reviewed were illustrative of what was available on the\n      market in the Spring of 1997. We believe, however, that the software market is likely\n      to continue to develop, and that products such as these will advance in\n      sophistication and expand in their usefulness as part of a fraud detection strategy. 
No\n      doubt HCFA will want to stay abreast of opportunities that this technology may present.\n\n\n                      VENDOR COMMENTS\n\n\nWe provided copies of our draft report and our contractor's (FMAS) report to the three vendors\nwhose software products we tested. We wish to express our appreciation to these companies for\ntheir willingness to let us use their products in this inspection, and for their review of and\ncomments on our draft report. Their overall comments reflect support for analytical work of this\nnature; however, the vendors express some concerns about our application of the software and\nraise some questions about our methodology. We address their comments here, and we include\nthe complete text in Appendix D.\n\nWe wish to address two points that the vendors raised regarding the manner in which we applied\nthe software. First, each vendor indicates that it is continuously updating and enhancing its\nproducts. One company even notes specifically that the software system we tested could be\nmodified in ways that address the Medicare population more directly. We are confident that\nsoftware enhancements will continue to expand the potential for products such as\nthese to play an important role in fraud prevention and detection.\n\nThe purpose of this inspection, however, was not to develop new software, but to test\ncommercially available off-the-shelf software. Our interest was in determining if products that\nwere on the market at the time we conducted our review (Spring 1997) could prove useful in\nidentifying hospitals that showed a high rate of DRG upcoding. 
Consequently, we did not modify\nthe vendors' software, nor did we ask them to modify the software or to develop a specific\nsoftware product for this purpose. Rather, we used the vendors' software packages on an\nas-is basis.\n\nSecond, the vendors raised concerns about our use of the software to go beyond identification of\nindividual claims that may have been coded incorrectly. We recognize that, to some extent, our\ntest was a modification of the original intent of these software products, which is to detect\nspecific clinical claims that are questionable. In essence, we took this underlying approach and\nextended it. We aggregated the results of the individual claims analysis in order to identify\nhospitals that the software showed to have a tendency toward upcoding. We then verified the\nsoftware products' performance by reviewing cases among a sample of the providers that the\nsoftware identified as having a high rate of upcoding. In our judgment, this was a logical\nextension of the software to the practical realities of how it could be used in the Medicare\nprogram. We used these products in a manner that might identify and focus on providers that\nmay warrant additional scrutiny in a fraud prevention and detection effort.\n\nWe also wish to address the methodological issues that Dhrystone Systems raised in its response\nto the report. First, this vendor questions the methods that we used to select our sample, in that\nthe sample contains outliers. We stand by our methodology; indeed, we designed the\nmethodology specifically to concentrate on the outlying providers that are most problematic. The\nexperimental group comprised hospitals that the software identified as lying at least two standard\ndeviations above the mean in the proportion of upcoding identified by the software. 
The control\ngroup comprised all remaining hospitals.\n\nSecond, Dhrystone also questions the appropriateness of our limiting the universe of cases\nreviewed to the 50 most common DRGs. In response, we note that we selected records from\nthese 50 DRGs to focus our review effort. These 50 DRGs comprise nearly 70 percent of all\ndischarges, and over 60 percent of all Medicare PPS reimbursement. Consequently, targeting\nthese 50 DRGs strikes us as a prudent means of focusing on where the greatest concentration of\nMedicare dollars lies. We do not generalize these results to broader populations of DRGs or of\nhospitals.\n\nDhrystone also states that the study purports to have been conducted in a double blind manner,\nand questions whether we did, in fact, do so. In response, we note that our review was conducted\nin a blinded manner; but we do not claim it was a double blind study. The initial review was\nconducted by a registered records coder in a fully blinded manner. If the coder found a\ndiscrepancy, the record was then unblinded; the coder then compared the hospital's reasoning\nwith her reasoning, and arrived at a determination of the appropriate coding. If disagreement\npersisted between the coder and the hospital, a second blind review was conducted, and the\nresults of both reviews were compared. In essence, this is a conservative way of conducting a review\nsuch as this. It clearly gives the benefit of any initial doubt to the hospital. 
We consider such a\nconservative approach to be prudent and likely reflective of any practical application of such\nsoftware by HCFA and the Office of Inspector General.\n\n\n                                 APPENDIX A\n\n\n                            Software Vendor Search\n\n      Our contractor, FMAS Corporation, consulted with the World Development Group\n      (WDG) to search for commercial software vendors that had products designed to locate\n      DRG upcoding using claims data.10,11\n\n      To begin the search for software vendors, WDG identified and interviewed relevant\n      experts and companies to obtain names of probable software vendors, additional experts,\n      and any other relevant information to assist in the search. Initially, WDG identified 33\n      experts in health informatics, medical expert systems, electronic medical records, and\n      Medicare Part A claims payment.\n\n      WDG sent each expert a fax describing the purpose of the project, the software of interest,\n      and the questions that it would ask in a telephone interview. WDG contacted and\n      interviewed 25 of the experts. This effort resulted in the identification of 11 additional\n      experts, 5 of whom WDG interviewed. In total, WDG interviewed 30 experts.\n\n      Concurrent with interviewing experts, WDG conducted a literature and Internet search to\n      identify relevant software vendors. WDG searched the following print sources:\n\n      <       1996 Annual Market Directory Issue. Health Management Technology. 1996.\n      <       Medical Hardware and Software Buyer's Guide. M.D. Computing 1995; 12 (6).\n      <       Ankrapp, Betty (ed). Health Care Software Sourcebook. 
Gaithersburg, MD:\n              Aspen Publishers, Inc., 1996.\n      <       Frisch, Bruce (ed). The HCP Directory of Medical Software. Brooklyn, NY:\n              Healthcare Computing Publications Inc., 1996.\n\n      The literature and Internet searches and interviews with industry experts identified 57\n      vendors whose product description indicated some type of claim auditing software or\n      services. WDG faxed a letter to each of these vendors to determine if they sold a product\n      that met the project's criteria of relevance. This search identified three more vendors who\n      claimed to have a relevant product.\n\n      In total, 21 of the 57 vendors appeared to meet the initial criteria of relevance. In\n      preparation for the telephone interview, WDG sent a fax to these vendors.\n\n      WDG interviewed 20 of the 21 vendors. One vendor did not respond to repeated calls.\n      Of the 20 vendors interviewed, 6 confirmed having a relevant product. Interviews with\n      the 6 confirmed vendors lasted an average of an hour. During these interviews, WDG\n      requested brochures and any other available product literature, as well as a contact name\n      for the software testing phase of the study.\n\n      Of the six vendors, five agreed to a test of their software with certain conditions. WDG\n      conducted follow-up interviews to obtain client references and discuss the test that would\n      be conducted. In preparation for these follow-up interviews, WDG developed questions to\n      query vendors about their software, clients, and willingness to test their software. 
WDG\n      sent each of the 5 vendors a fax describing the purpose of the test and the topics to be\n      discussed during the follow-up interview.\n\n      WDG contacted and interviewed all 5 vendors. The interviews lasted an average of 15\n      minutes. During these interviews, WDG requested as references the names of two payers\n      or fiscal intermediaries. If the vendor did not have payer or fiscal intermediary references,\n      WDG accepted any client references. Subsequently, WDG interviewed two client\n      references for an average of 10 minutes each.\n\n      Vendors' concerns about the test fell into three categories: 1) the size of our test (5-10\n      million claims records) was too large; 2) vendors were uncertain about how OIG would\n      use the results of the test; and 3) OIG's desired layout of the output was not clear\n      enough.\n\n      Because of these concerns, only three vendors chose to remain in the study and participate\n      in a test of their software.\n\n\n                                 APPENDIX B\n\n\n                                Testing Methodology\n\n       We executed the study in two phases, using a contractor with expertise in medical\n       record review and statistical sampling for highly specialized tasks. In phase one, we\n       located software products that might detect upcoding, used these products to generate a\n       sample of hospitals, and drew a sample of medical records from these hospitals. 
In phase\n       two, we performed a DRG validation on each case in our sample and used the results of\n       this validation to determine if the software products used in phase one accurately predicted\n       DRG upcoding.\n\n       We began phase one by issuing a Request for Proposals to locate a contractor with\n       expertise in medical record review and statistical sampling to assist in the study. We\n       contracted with FMAS Corporation, a company with extensive experience performing\n       case review and analysis for the health care programs of the U.S. Department of Health\n       and Human Services and the Department of Defense.12\n\n       FMAS worked with World Development Group (WDG) to locate vendors of software\n       that detects DRG upcoding. From a field of 57 probable vendors, WDG identified 3\n       vendors that had relevant software and were willing to participate in our test. (See\n       Appendix A.)\n\nSample Selection\n\n       We used the software from these three vendors to process 100 percent of Medicare\n       Prospective Payment System (PPS) cases from January through June 1996.13 As output,\n       each software product flagged cases that it deemed likely to have an upcoded DRG. Next, we\n       made 3 lists of hospitals with high predicted rates of upcoding by collapsing each\n       product's output by hospital. Through correlation analysis, we discovered a strong\n       relationship between the lists of hospitals from two of the products, while the list from the\n       third product differed significantly. This meant that we would have to draw two separate\n       samples to have a sample of hospitals that was representative of hospitals identified by all\n       three products. 
Thus, due to limits on the number of medical records we could review for\n       this study, we decided to focus our inquiry by testing only the output from the two\n       software products whose lists were closely correlated.14\n\n       To build our experimental (test) sample, we first selected hospitals that either of the two\n       products indicated had a predicted upcoding rate at or above the mean rate plus two\n       standard deviations. This process led to a group of 299 hospitals, which we stratified into three\n       groups according to number of Medicare discharges in the 6-month file we analyzed: 300\n       or fewer discharges, 301 to 1,000 discharges, and over 1,000 discharges. Next, in\n      proportion to the total number of hospitals in each stratum, we randomly selected a total\n      of 50 hospitals from across the 3 strata.\n\n      From each hospital, we then randomly selected 40 Medicare cases billed under any of the\n      50 DRGs that were most commonly used across the country during fiscal year 1996. As a\n      control sample, we executed the same sampling strategy to select 800 cases from 20\n      hospitals that did not have high predicted rates of upcoding. This brought our total\n      sample to 2,800 cases: 2,000 of which were from hospitals that had high predicted rates of\n      upcoding, 800 of which were from hospitals that did not have high predicted rates of\n      upcoding. We then merged claims data from each case with Medicare's Enrollment\n      Data Base (EDB) to obtain beneficiary name and the Online Survey Certification and\n      Reports (OSCAR) system to obtain hospital name and address. 
We used this information\n      to mail medical record request letters and case listings to the administrator of each hospital\n      in our sample. Hospitals sent medical records to the OIG, where we logged them, gave\n      them a quality check, and assigned each a tracking number. We then sent the records to\n      FMAS for DRG coding validation.\n\nDRG Coding Validation\n\n      During phase two of the study, the contractor, FMAS, performed a DRG coding validation\n      on 2,622 (94 percent) of the 2,800 records in our sample. FMAS, using Registered\n      Records Analysts and Accredited Records Technicians, performed a blinded record\n      review, in which the original ICD-9-CM and DRG codes were hidden. This review\n      generated new ICD-9-CM codes and a new DRG code for each case in the sample. When\n      FMAS completed reviewing a record, it compared the new codes to the previously hidden\n      codes used by the hospital. Below is the DRG reconciliation process:\n\n      If FMAS' codes and the hospital's codes matched, FMAS noted the DRG as correctly\n      coded by the hospital. Depending on the specific ICD-9-CM codes assigned by FMAS, it\n      assigned one of the following two reconciliation reason codes to the case:\n\n      1.      Confirm: Face Sheet, UB-92, FMAS codes and DRGs match.\n\n      2.      DRGs match, but there is some variance in codes.\n\n      If FMAS' codes initially disagreed with those of the hospital, FMAS still noted the\n      hospital's DRG as correctly coded by the hospital if its reviewer agreed with the hospital's\n      coding after performing an unblinded reconciliation review. 
FMAS' reviewer then\n      assigned one of the following reconciliation codes to the case:\n\n      3.      DRGs differed because more than one diagnosis could have been the principal\n              diagnosis according to guidelines, and the hospital selected the principal diagnosis\n              leading to a lower-weighted DRG. FMAS did not recode or regroup these cases either in\n              software or on its hardcopy worksheet.\n\n      4.      DRGs differed because of a judgment-call situation not covered by guidelines or\n              Coding Clinic. FMAS' reviewer gave the hospital the benefit of the doubt. FMAS\n              did not recode or regroup these cases either in software or on its hardcopy\n              worksheet.\n\n      5.      DRGs differed but FMAS' reviewer, upon reviewing the hospital's codes/DRG, noted\n              that the hospital's DRG was correct. This was the only reconciliation reason\n              category that FMAS recoded and regrouped in software and on the hardcopy\n              worksheet so that its final DRG matched the initial hospital DRG.\n\n      6.      UB-92 DRG differed but hospital face sheet matched FMAS' DRG. 
FMAS did\n              not recode or regroup these cases either in software or on its hardcopy worksheet.\n              This category was selected whenever the codes on the face sheet would have led\n              to the same DRG as the FMAS DRG, but the UB-92 DRG and related codes were\n              different.\n\n      7.      FMAS reserved this reconciliation code for potential additional reconciliation\n              reasons, but did not use it during the study.\n\n      Whenever the DRGs differed after reconciliation, FMAS assigned the following\n      reconciliation reason code to the case:\n\n      8.      DRGs differ. Upon review of the hospital's DRG codes, FMAS' reviewer confirmed\n              that FMAS' DRG was correct based upon coding guidelines and Coding Clinic.\n              FMAS did not recode or regroup these cases either in software or on its hardcopy\n              worksheet. FMAS recorded all applicable DRG variance reasons and one DRG\n              variance type (described below) on its DRG variance worksheet. 
FMAS then\n              completed a second blinded review of the case using a different reviewer.\n\n              Variance types for reconciliation reason 8:\n\n              Misspecification: The narrative principal diagnosis, a secondary diagnosis, or a\n              procedure is not supported by the medical record.\n\n              Miscoding: The medical records department selected an incorrect ICD-9-CM\n              numeric code for a correct narrative diagnosis or procedure.\n\n              Resequencing: The hospital substituted a secondary diagnosis for the correctly\n              attested and coded principal diagnosis.\n\n              Other: The hospital made another type of error (such as incorrect discharge\n              status) that led to DRG variance but cannot be categorized as one of the three\n              types above.\n\nOIG Analysis\n\n      FMAS sent data for the completed medical record reviews to OIG in electronic format,\n      keyed by our tracking number. We merged these data with the original inpatient claims\n      data and additional administrative data to create our analytical files for the study.\n\n      We analyzed these files on three levels: by hospital, by DRG, and by case. To perform\n      hospital-level and DRG-level analyses, we aggregated our data by hospital and DRG to\n      compare actual and predicted rates of upcoding. Case-level analysis examined the success\n      of the software in predicting DRG upcoding on a case-by-case basis. We used t-tests\n      and logistic regression to determine statistical differences. 
We performed data analysis\n      using SAS software.15\n\n\n                                     APPENDIX C\n\n\n                                        Statistical Tables\n\n                                       TABLE C-1\n                          CHARACTERISTICS OF HOSPITALS REVIEWED\n                                      Control Sample            Test Sample               Total\n                                          (n=20)                  (n=50)               Sample (n=70)\n                                             n    (%)                n    (%)               n (%)\n      Number of Beds\n        1-99                               14 (70.0)               42    (84.0)            56 (80.0)\n         100-299                             5 (25.0)                4     (8.0)            9 (12.9)\n         300+                                1     (5.0)             4     (8.0)            5     (7.1)\n      Teaching Status\n        Teaching                             3 (15.0)              10    (20.0)            13 (18.6)\n         Nonteaching                       17 (85.0)               40    (80.0)            57 (81.4)\n      Location\n        Metropolitan                       11 (55.0)               10    (20.0)            21 (30.0)\n         Nonmetropolitan                     9 (45.0)              40    (80.0)            49 (70.0)\n      Control\n        For profit                           3 (15.0)                3     (6.0)            6     (8.6)\n         Nonprofit                         14 (70.0)               20    (40.0)            34 (48.6)\n         Government                          3 (15.0)              27    (54.0)            30 (42.9)\n      Number of\n      Discharges, 1/96-6/96\n         1-300                               9 
(45.0)              32    (64.0)            41 (58.6)\n         301-1000                            8 (40.0)              13    (26.0)            21 (30.0)\n         1001+                               3 (15.0)                5   (10.0)             8 (11.4)\n      Source: OIG analysis of the FY 1996 Medicare Provider Analysis and Review (MEDPAR) file and data\n      from the Online Survey Certification Reports (OSCAR) system.\n\n\n                                                  TABLE C-2\n                   COMPARISON OF TEST SAMPLE WITH ALL HOSPITALS WITH\n                          HIGH PREDICTED RATES OF UPCODING\n                                                      Test Sample         Total High Predicted\n                                                        (n=50)              Group (n=299)\n                                                          n     (%)               n      (%)\n\n                  Number of Beds\n                    1-99                                42    (84.0)           219      (73.2)\n                     100-299                              4     (8.0)           49      (16.4)\n                     300+                                 4     (8.0)           31      (10.4)\n                  Teaching Status\n                    Teaching                            10    (20.0)            62      (20.7)\n                     Nonteaching                        40    (80.0)           237      (79.3)\n                  Location\n                    Metropolitan                        10    (20.0)            89      (29.8)\n                     Nonmetropolitan                    40    (80.0)           210      (70.2)\n                  Control\n                    For profit                         
   3     (6.0)           30      (10.0)\n                     Nonprofit                          20    (40.0)           124      (41.5)\n                     Government                         27    (54.0)           145      (48.5)\n                  Number of\n                  Discharges, 1/96-6/96\n                     1-300                              32    (64.0)           155      (51.8)\n                     301-1,000                          13    (26.0)           109      (36.5)\n                     1,001+                               5   (10.0)            35      (11.7)\n                  Source: OIG analysis of the FY 1996 Medicare Provider Analysis and Review\n                  (MEDPAR) file and data from the Online Survey Certification Reports (OSCAR)\n                  system.\n\n\n                                      TABLE C-3\n                   HOSPITAL CHARACTERISTICS BY CASE CHARACTERISTICS\n                                 FOR CASES REVIEWED\n                                    Control Sample            Test Sample                 Total\n                                       (n=744)                 (n=1,878)                (n=2,622)\n                                           n    (%)               n      (%)              n    (%)\n\n        Number of Beds\n          1-99                          531 (71.4)           1,583     (84.3)        2,114 (80.6)\n           100-299                      175 (23.5)             156       (8.3)         331 (12.6)\n           300 +                         38     (5.1)          139       (7.4)         177     (6.8)\n        Teaching Status\n          Teaching                      108 (14.5)             372     (19.8)          480 (18.3)\n           Nonteaching       
            636 (85.5)           1,506     (80.2)        2,142 (81.7)\n        Location\n          Metropolitan                  407 (54.7)             383     (20.4)          790 (30.1)\n           Nonmetropolitan              337 (45.3)           1,495     (79.6)        1,832 (69.9)\n        Control\n          For profit                    119 (16.0)             120       (6.4)         239     (9.1)\n           Nonprofit                    511 (68.7)             770     (41.0)        1,281 (48.9)\n           Government                   114 (15.3)             988     (52.6)        1,102 (42.0)\n        Source: OIG analysis of the FY 1996 Medicare Provider Analysis and Review (MEDPAR) file and\n        data from the Online Survey Certification Reports (OSCAR) system.\n\n\n                                                 TABLE C-4\n                      BENEFICIARY CHARACTERISTICS FOR CASES REVIEWED\n                                    Control Sample             Test Sample                 Total\n                                       (n=744)                  (n=1,878)                (n=2,622)\n                                           n    (%)                n     (%)               n    (%)\n\n           Age (years)\n             <65                        106 (14.3)              268     (14.3)          374 (14.3)\n              65-74                     196 (26.3)              489     (26.0)          685 (26.1)\n              75-84                     269 (36.2)              695     (37.0)          964 (36.8)\n              85+                       173 (23.3)              426     (22.7)          599 (22.9)\n           Sex\n             Male                       313 (42.1)              827     (44.0)        1,140 (43.5)\n        
       Female                    431 (57.9)            1,051     (56.0)        1,482 (56.5)\n           Race\n             White                      648 (87.1)            1,573     (83.8)        2,221 (84.7)\n              Black                       59    (7.9)           201     (10.7)          260     (9.9)\n              Other                       28    (3.8)             87     (4.6)          115     (4.4)\n              Unknown                      9    (1.2)             17     (0.9)            26    (1.0)\n           Source: OIG analysis of the FY 1996 Medicare Provider Analysis and Review (MEDPAR) file.\n\n\n                                     TABLES C-5\n\n                        RESULTS OF LOGISTIC REGRESSION MODEL\n\n\n                                  TABLE C-5A\n                    ODDS RATIO ESTIMATES FOR STATISTICALLY\n                            SIGNIFICANT VARIABLES\n                                                    90% Confidence Interval\n                        Variable           Estimate   Lower       Upper\n              Facility in Selected Group     1.94      1.41        2.66\n              Teaching Hospital              0.52      0.37        0.74\n              Publicly Owned                 1.79      1.39        2.32\n              High Case Mix Index            2.59      1.96        3.43\n              Male                           0.68      0.54        0.87\n              Age 75 to 84                   1.40      1.07        1.85\n              Age 85+                        1.60      1.18        2.17\n              Flagged by Product A           3.49      2.73        4.46\n              Flagged by Product B*          1.25      0.99        1.58\n\n             *This variable was not 
significant.\n\n\n\n                                                  TABLE C-5B\n                                DEPENDENT VARIABLE IN LOGISTIC REGRESSION MODEL\n                          Case Actually Upcoded as Determined by Our Medical Records Reviewers\n\nChange                    1 = DRG Upcoded (N=254)         0 = DRG not Upcoded (N=2,368)\n\n\n                                      TABLE C-5C\n                   INDEPENDENT VARIABLES IN LOGISTIC REGRESSION MODEL\n                                                Facility Characteristics\n\nProfit         1 = For profit                 0 = Nonprofit\nPublic         1 = Public                     0 = Nonprofit\nLocation       1 = Nonmetropolitan            0 = Metropolitan\nTeaching       1 = Teaching                   0 = Nonteaching\nSmallbed       1 = 1-99 beds                  0 = 100-299 beds\nBigbed         1 = 300+ beds                  0 = 100-299 beds\nFewDC          1 = 300 or fewer discharges    0 = 301-1,000 discharges\nManyDC         1 = Over 1,000 discharges      0 = 301-1,000 discharges\nLowCMI         1 = CMI less than 0.9          0 = CMI between 0.9 and 1.1\nHighCMI        1 = CMI over 1.1               0 = CMI between 0.9 and 1.1\nExpSamp        1 = Facility in test group of hospitals    0 = Facility in control group of hospitals\n                                                Patient Characteristics\n\nGender         1 = Male                       0 = Female\nBlack          1 = Black                      0 = White\nOther          1 = Other                      0 = White\nUnknown        1 = Unknown                    0 = White\nYoung          1 = Under 65                   0 = 65-74\nSeven5         1 = 75-84                      0 = 65-74\nEight5         1 = 85 and Older             
  0 = 65-74\n                                                 Case Characteristics\n\nSurgical       1 = Surgical Claim             0 = Nonsurgical Claims\n                                               Software Characteristics\n\nA_Hit          1 = Flagged by Software A      0 = Not flagged by Software A\nB_Hit          1 = Flagged by Software B      0 = Not flagged by Software B\n\n\n                                   TABLE C-6\n       SOFTWARE PERFORMANCE ON THE 10 DRGS WITH HIGHEST RATES OF UPCODING\n DRG                                                                        Cases Reviewed         Percent   Product A                 Product B\n (% of Medicare discharges)                                                 (% of reviewed cases)  Upcoded   Sensitivity  Specificity  Sensitivity  Specificity\n
 87 Pulmonary edema & respiratory failure (0.6%)                            32 (1.2%)              41%       69%          60%          62%          47%\n 79 Respiratory infections & inflammations age >17 w/cc (2.2%)              186 (7.1%)             35%       95%          36%          55%          36%\n 144 Other circulatory system diagnoses w/cc (0.7%)                         23 (0.9%)              30%       86%          86%          71%          42%\n 239 Pathological fractures & musculoskeletal & conn tiss malignancy (0.5%) 25 (1.0%)              24%       0%           0%           33%          18%\n 429 Organic disturbances & mental retardation (0.4%)                       13 (0.5%)              23%       0%           N/A*         33%          50%\n 416 Septicemia age >17 (2.0%)                                              84 (3.2%)              20%       94%          23%          82%          22%\n 475 Respiratory system diagnosis with ventilator support (0.9%)            26 (1.0%)              19%       100%         20%          100%         26%\n 188 Other digestive system diagnoses age >17 w/cc (0.6%)                   17 (0.6%)              18%       0%           0%           67%          20%\n 121 Circulatory disorders w/AMI & C.V. comp disch alive (1.5%)             54 (2.1%)              15%       25%          15%          88%          15%\n 316 Renal failure (0.8%)                                                   34 (1.3%)              15%       20%          33%          20%          6%\n\n*Note:\nSensitivity = N/A when we found no upcoded cases within a DRG, i.e., the denominator in our sensitivity\ncalculation is zero.\n\nSpecificity = N/A when the software did not flag any cases within a DRG, i.e., the denominator of our specificity\ncalculation is zero.\n\n\n
                                      TABLE C-7\n                SOFTWARE PERFORMANCE ON TEN DRGS WITH HIGHEST RATES OF\n                             UPCODING VERSUS ALL OTHERS\n                                                               MEAN      STD DEV        t       P<\n\n               PRODUCT A SENSITIVITY\n\n                 Top 10 Upcoded DRGs                          0.4894       0.4367\n\n                 All Others (25 with a score*)                0.2763       0.3351     1.558     n/s\n\n               PRODUCT A SPECIFICITY\n\n                 Top 10 Upcoded DRGs (9 with a score*)        0.3045       0.2787\n\n                 All Others (25 with a score)                 0.0891       0.1038     2.265     .10\n\n\n               PRODUCT B SENSITIVITY\n\n                 Top 10 Upcoded DRGs                          0.6115       0.2596\n\n                 All Others (25 with a score)                 0.4808       0.3560     1.203     n/s\n\n               PRODUCT B SPECIFICITY\n\n                 Top 10 Upcoded DRGs                          0.2819       0.1479\n\n     
            All Others (36 with a score)                 0.0594       0.0826     4.562     .05\n\n*Note:\nWhen a DRG may not have a sensitivity score: DRGs with no upcoding will not have a sensitivity score, as the\nsensitivity denominator, the number of upcoded cases, is zero. Fifteen of the 50 DRGs in our sample had no\nupcoding.\n\nWhen a DRG may not have a specificity score: DRGs that had no cases flagged by the software will not have a\nspecificity score, as the specificity denominator, the number of flagged cases, is zero. Sixteen DRGs had no cases\nflagged by Product A. Four DRGs had no cases flagged by Product B.\n\n\n                                     TABLE C-8\n           SOFTWARE PERFORMANCE ON THE 10 MOST COMMONLY OCCURRING DRGS\n DRG                                                                        Cases Reviewed         Product A                 Product B\n (% of Medicare discharges)                                                 (% of reviewed cases)  Sensitivity  Specificity  Sensitivity  Specificity\n
 127 Heart failure & shock (6.3%)                                           245 (9.3%)             0%           0%           55%          5%\n 89 Simple pneumonia & pleurisy age >17 w/cc (4.0%)                         299 (11.4%)            11%          4%           67%          8%\n 14 Specific cerebrovascular disorders except TIA (3.4%)                    129 (4.9%)             58%          14%          58%          8%\n 88 Chronic obstructive pulmonary disease (3.3%)                            163 (6.2%)             0%           0%           0%           0%\n 209 Major joint & limb reattachment procedures - lower extremity (3.2%)    47 (1.8%)              N/A*         N/A*         N/A*         0%\n 79 Respiratory infections & inflammations age >17 w/cc (2.2%)              186 (7.1%)             95%          36%          55%          36%\n 174 G.I. hemorrhage w/cc (2.2%)                                            82 (3.1%)              25%          7%           38%          8%\n 182 Esophagitis, gastroent & misc digest disorders age >17 w/cc (2.1%)     105 (4.0%)             10%          5%           30%          7%\n 296 Nutritional & misc metabolic disorders age >17 w/cc (2.1%)             108 (4.1%)             36%          26%          50%          10%\n 112 Percutaneous cardiovascular procedures (2.0%)                          9 (0.3%)               N/A*         N/A*         N/A*         0%\n\n*Note:\nSensitivity = N/A when we found no upcoded cases within a DRG, i.e., the denominator in our sensitivity\ncalculation is zero.\n\nSpecificity = N/A when the software did not flag any cases within a DRG, i.e., the denominator of our specificity\ncalculation is zero.\n\n\n
                                      TABLE C-9\n                 SOFTWARE PERFORMANCE ON TEN MOST COMMON DRGS VERSUS\n                                     ALL OTHERS\n                                                              MEAN       STD DEV        t       P<\n\n               PRODUCT A SENSITIVITY\n\n                10 Most Common DRGs (8 with a score*)         0.2944      0.3314\n\n                All Others (27 with a score)                  0.3499      0.3896     0.364      n/s\n\n               PRODUCT A SPECIFICITY\n\n                10 Most Common DRGs (9 with a score*)         0.1159      0.1327\n\n                All Others (27 with a score)                  0.1529      0.2028     0.484      n/s\n\n\n               PRODUCT B SENSITIVITY\n\n                10 Most Common DRGs (8 with a score)          0.4405      0.2128\n\n                All Others (27 with a score)                  0.5412      0.3611     0.745      n/s\n\n               PRODUCT B SPECIFICITY\n\n                10 Most Common DRGs                           0.0817      0.1050\n\n                All Others (36 with a score)                  0.1150      0.1430     0.684      n/s\n\n\n*Note:\nWhen a DRG may not have a sensitivity score: DRGs with no upcoding will not have a sensitivity score, as the\nsensitivity denominator, the number of upcoded cases, is zero. Fifteen of the 50 DRGs in our sample had no\nupcoding.\n\nWhen a DRG may not have a specificity score: DRGs that had no cases flagged by the software will not have a\nspecificity score, as the specificity denominator, the number of flagged cases, is zero. 
Sixteen DRGs had no cases\nflagged by Product A. Four DRGs had no cases flagged by Product B.\n\n\n                        APPENDIX D\n\n\n                  Software Vendors' Comments\n\n\n                              
   APPENDIX E\n\n\n                                            Notes\n\n1. Department of Health and Human Services, Health Care Financing Administration, Office of\nthe Actuary. Personal communication, April 1998.\n\n2. Department of Health and Human Services, Office of Inspector General, Report on the\nFinancial Statement Audit of the Health Care Financing Administration for Fiscal Year 1997,\nA-01-97-00520, May 1998.\n\n3. Department of Health and Human Services, Office of Inspector General, National DRG\nValidation Study Special Report on Coding Accuracy, OAI-12-88-01010, February 1988.\n\n4. Department of Health and Human Services, Office of Inspector General, National DRG\nValidation Study Update: Summary Report, OEI-12-89-00190, August 1992.\n\n5. The main issues of concern for those declining to participate fell into three categories: 1) the\nsize of our test (5-10 million claims records) was too large; 2) vendors were uncertain about how\nOIG would utilize the results of the test; and 3) OIG's desired layout of the output was not clear\nenough.\n\n6. We used the Medicare Provider Analysis and Review (MEDPAR) file as input for the software\nproducts. This file contains diagnostic, billing, and beneficiary demographic data for each stay in\nan inpatient hospital by a Medicare beneficiary. Our test ran approximately 6 million MEDPAR\nrecords through each software product.\n\n7. The fact that output from one vendor's software differed significantly and that we decided not\nto test it is in no way a reflection on the potential merit of that software.\n\n8. Although not the purpose of this evaluation, we also kept track of cases that were undercoded\n(i.e., cases in which the hospital billed for a less expensive DRG than it should have). Our review\nfound that out of 2,622 cases, 124 cases (4.73 percent) were undercoded while 254 cases (9.69\npercent) were upcoded.\n\n9. 
The 10 most frequent DRGs in Medicare make up a higher percentage of the discharges in our\nsample than of all Medicare discharges (13 percent versus 10 percent) due to our\nsampling strategy. We sampled only among the 50 most common DRGs.\n\n10. FMAS Corporation. 11300 Rockville Pike, Rockville, MD 20852.\n\n11. World Development Group, Incorporated. 5101 River Road, Suite 1913, Bethesda, MD\n20816-1574.\n\n12. FMAS Corporation. 11300 Rockville Pike, Rockville, MD 20852.\n\n13. This represents about 6 million admissions. We used the Medicare Provider Analysis and\nReview (MEDPAR) file as input.\n\n14. The fact that output from one product differed significantly and that we decided not to test it\nis in no way a reflection on the potential merit of that product.\n\n15. SAS Institute, Inc. SAS Campus Drive, Cary, NC 27513.\n"